In [26]:
from pathlib import Path
import numpy as np
import pandas as pd
import pyomo.environ as pyo
from gilp.simplex import LP
from gilp.visualize import simplex_visual
More than just ML/AI!¶
- Three types of analytics:
- Descriptive: What happened/is happening?
- Predictive: What will happen?
- Prescriptive: What to do?
- Today I want to focus on prescriptive analytics
Mathematical Optimization¶
- Construct a mathematical model of reality and use it to make decisions
- Three major components:
- Decision Variables: The values we can control to find the best solution
- Objective Function: The function that tells us how well we are doing
- Constraints: The rules we apply to our model
Linear Program¶
- A type of mathematical optimsation
- Linear Programming the big idea: the objective and all constraints are linear in the decision variables
- We can model many problems by assuming linearity
Mixed Integer Program¶
- Linear program with the additional constraint that some of the decision variables are integers
- Usually harder to compute
What does linear mean?¶
- $f(x_1, x_2, \ldots, x_n) = c_1x_1 + c_2x_2 + \ldots c_nx_n + b$
- This is technically an affine function, but is considered linear in optimisation
Examples of linear functions¶
- $5x + 3$
- $3x + 7y -5$
- $2x-y+5z$
Examples of non-linear functions¶
- $x^2$
- $xy$
- $\sin(x)$
What does linear look like?¶
- In 2 dimensions a linear equation is a straight line
- In 3 dimensions a linear equation is a plane
- In 4 or more dimenions a linear equations is hyper plane (don't ask me what that looks like)

Today's problem¶
- I like to eat ...
- ... but I don't want to be unhealthy
- How can I eat the most while still meeting my dietary requirements?
Dietary Constraints¶
- Want to make sure our diet meets nutritional constraints
- Limit dietary intake that is bad for us in excess
- Make sure we get at least enough of others
- Source: https://www.fda.gov/media/99059/download
In [29]:
upper_bound_constraints = {
"Total Fat": 78,
"Saturated Fat": 20,
"Cholesterol": 300,
"Sodium": 2_300,
"Sugars": 50,
}
In [30]:
lower_bound_constraints = {
"Carbohydrates": 275,
"Dietary Fiber": 28,
"Protein": 50,
}
In [32]:
menu = pd.read_csv(
menu_file_path,
usecols=all_cols,
index_col="Item"
)
In [33]:
# Filter out rows with no calories
# This can't be the case as all nutrients contain some caloric value
menu = menu[menu["Calories"] > 1e-6]
In [34]:
menu.head()
Out[34]:
| Category | Calories | Total Fat | Saturated Fat | Cholesterol | Sodium | Carbohydrates | Dietary Fiber | Sugars | Protein | |
|---|---|---|---|---|---|---|---|---|---|---|
| Item | ||||||||||
| Egg McMuffin | Breakfast | 300 | 13.0 | 5.0 | 260 | 750 | 31 | 4 | 3 | 17 |
| Egg White Delight | Breakfast | 250 | 8.0 | 3.0 | 25 | 770 | 30 | 4 | 3 | 18 |
| Sausage McMuffin | Breakfast | 370 | 23.0 | 8.0 | 45 | 780 | 29 | 4 | 2 | 14 |
| Sausage McMuffin with Egg | Breakfast | 450 | 28.0 | 10.0 | 285 | 860 | 30 | 4 | 2 | 21 |
| Sausage McMuffin with Egg Whites | Breakfast | 400 | 23.0 | 8.0 | 50 | 880 | 30 | 4 | 2 | 21 |
Simplifying the problem¶
In [35]:
len(menu)
Out[35]:
244
- We are working in 244 dimensions!
- I can't visualise 244 dimensions :(
- Let's reduce it to 2
In [36]:
restricted_menu = menu.loc[["Big Mac", "Side Salad", "Egg McMuffin"]]
In [37]:
restricted_menu.head()
Out[37]:
| Category | Calories | Total Fat | Saturated Fat | Cholesterol | Sodium | Carbohydrates | Dietary Fiber | Sugars | Protein | |
|---|---|---|---|---|---|---|---|---|---|---|
| Item | ||||||||||
| Big Mac | Beef & Pork | 530 | 27.0 | 10.0 | 85 | 960 | 47 | 3 | 9 | 24 |
| Side Salad | Snacks & Sides | 20 | 0.0 | 0.0 | 0 | 10 | 4 | 1 | 2 | 1 |
| Egg McMuffin | Breakfast | 300 | 13.0 | 5.0 | 260 | 750 | 31 | 4 | 3 | 17 |
Formulating our model¶
Constructing a model in pyomo and HiGHS¶
- The model will be small and can be solved by hand if needed
- However this doesn't scale
- We want a solver that we can give a model to (e.g. HiGHS, Gurobi)
- We want a standardised way of defining models (pyomo)
In [13]:
diet_model_small = pyo.ConcreteModel()
What do we control?¶
- The amount of Big Mac and Side Salad we consume
- Let's call the total amount of Big Mac's we consume $x_1$ and the total number of Side Salads $x_2$.
In [14]:
diet_model_small.x = pyo.Var(
[x for x in restricted_menu.index],
domain=pyo.NonNegativeReals
)
What do we want to optimise?¶
We want to maximise the total number of calories consumed
Mathematically this is equivalent to maximising $c_1x_1 + c_2x_2 + c_3x_3$, where:
- $c_1$ is the number of calories per Big Mac
- $x_1$ is the amount of Big Macs eaten
- $c_2$ is the number of calories per Side Salad
- $x_2$ is the amount of Side Salad eaten
- $c_3$ is the number of calories per Egg McMuffin
- $x_3$ is the amount of Egg McMuffin eaten
This gives us the formula $530x_1 + 20x_2 + 300x_3$
In [15]:
diet_model_small.obj_func = pyo.Objective(
# sum_product is the same as c_1x_1 + c_2x_2 + ... + c_nx_n
expr=pyo.sum_product(menu["Calories"], diet_model_small.x),
sense=pyo.maximize,
)
What are we constrained by?¶
- Amount of Big Mac and Side Salads we can eat without violating the constraints we have
- Actually have two sets of constraints;
- Upper bound constraints; Those we cannot exceed
- Lower bound constraints; Those we must meet
In [16]:
c = restricted_menu["Calories"].values
A = restricted_menu[list(upper_bound_constraints.keys())].T.values
b = np.array(list(upper_bound_constraints.values()), dtype=float)
lp = LP(A, b, c)
simplex_visual(lp).show()
Expanding the model¶
- Solving visually is great
- Until you need to visualise hig
- This is where computers really kick in
Adding the constraints programmatically¶
In [17]:
diet_model = pyo.ConcreteModel()
diet_model.x = pyo.Var(
[x for x in menu.index],
domain=pyo.NonNegativeReals,
)
diet_model.obj_func = pyo.Objective(
expr=pyo.sum_product(menu["Calories"], diet_model.x),
sense=pyo.maximize
)
In [18]:
def prettify(s: str) -> str:
"""Make variable names display more nicely"""
return s.replace(" ", "_").lower()
In [19]:
def add_constraints(model, constraints, sense, menu):
for name, constraint_value in constraints.items():
constraint_name = f"constraint_{prettify(name)}"
if sense == "<=":
expr = pyo.sum_product(menu[name], model.x) <= constraint_value
elif sense == ">=":
expr = pyo.sum_product(menu[name], model.x) >= constraint_value
else:
raise ValueError(f"Invalid sense: {sense}")
constraint = pyo.Constraint(expr=expr)
setattr(model, constraint_name, constraint)
In [20]:
add_constraints(diet_model, upper_bound_constraints, "<=", menu)
add_constraints(diet_model, lower_bound_constraints, ">=", menu)
Solving the model¶
In [21]:
solver = pyo.SolverFactory("appsi_highs")
In [22]:
_ = solver.solve(diet_model)
In [23]:
diet_model.obj_func()
Out[23]:
2329.7374211529996
In [24]:
def get_basic_variables(model: pyo.ConcreteModel) -> dict[str, float]:
basic_variables = {}
for key in model.x.keys():
if abs(model.x[key].value) > 1e-6:
basic_variables[key] = model.x[key].value
return basic_variables
In [25]:
get_basic_variables(diet_model)
Out[25]:
{'Premium Grilled Chicken Classic Sandwich': 1.3574886313627692,
'Kids French Fries': 13.156520463547015,
'Side Salad': 9.171189672876631,
'Nonfat Latte with Sugar Free French Vanilla Syrup (Small)': 1.5998239694880447}
Extensions:¶
- limiting to only having whole items?
- limiting to $n$ items?
- limiting to different times?
- limiting to different categories?